ABSTRACT 

We study the geometry and topology of the large-scale structure traced by 
galaxy clusters in numerical simulations of a box of side 320 $h^{-1}$ Mpc, and 
compare them with available data on real clusters. The simulations we use are 
generated by the Zel'dovich approximation, using the same methods as we have 
used in the first three papers in this series. We consider the following models 
to see if there are measurable differences in the topology and geometry of the 
superclustering they produce: (i) the standard CDM model (SCDM); (ii) a 
CDM model with $\Omega_0 = 0.2$ (OCDM); (iii) a CDM model with a 'tilted' power 
spectrum having $n = 0.7$ (TCDM); (iv) a CDM model with a very low Hubble 
constant, $h = 0.3$ (LOWH); (v) a model with mixed CDM and HDM (CHDM); 
(vi) a flat low-density CDM model with $\Omega_0 = 0.2$ and a non-zero cosmological 
$\Lambda$ term ($\Lambda$CDM). We analyse these models using a variety of statistical tests 
based on the analysis of: (i) the Euler-Poincare characteristic; (ii) percolation 
properties; (iii) the Minimal Spanning Tree construction. Taking all these tests 
together, we find that the best fitting model is $\Lambda$CDM and, indeed, the others do 
not appear to be consistent with the data. Our results demonstrate that despite 
their biased and extremely sparse sampling of the cosmological density field, 
it is possible to use clusters to probe subtle statistical diagnostics of models 
which go far beyond the low-order correlation functions usually applied to 
study superclustering. 

Key words: Cosmology: theory - dark matter - galaxies: clustering - large-scale structure of Universe 



1 INTRODUCTION 

The study of the distribution of matter on the largest scales amenable to observation can provide important constraints 
on models of the formation of cosmological structures. In particular, it has now become well established that a very 
accurate and efficient way of describing very large scale structure in the galaxy distribution is obtained by not looking 
at galaxies themselves but at rich clusters of galaxies. If the 'standard' model of structure formation - the gravitational 
instability picture - is correct, the expected displacements of galaxy clusters from their primordial positions are much 
smaller than the typical separation of these objects. In principle, therefore, clusters of galaxies can yield clues about 
the primordial spectrum of perturbations that gave rise to them, without such clues being trampled on by the effects 
of non-linear evolution. Moreover, because clusters represent highly overdense regions in the cosmological density 
field, these objects display an enhanced clustering signal relative to that of galaxies on the same scale, an effect 
usually known as biasing (Kaiser 1984). 

This is the reason why so much effort has been devoted to compiling deep cluster surveys, starting with the 
pioneering work of Abell (1958), Zwicky et al. (1968) and Abell, Corwin & Olowin (1989), and leading up to extended 
redshift surveys both in the optical (e.g. Postman, Huchra & Geller 1992; Dalton et al. 1994; Collins et al. 1994, and 
references therein) and in the X-ray (e.g. Nichol, Briel & Henry 1994; Romer et al. 1994; Ebeling et al. 1996) regions 
of the spectrum. 

The properties of galaxy clusters may help to resolve some of the issues that have led to the present relative 
stagnation in the theory of structure formation. Since the demise of the standard model of the 1980s - the standard 
Cold Dark Matter model (SCDM) - a number of contending theories have been proposed which are in better agreement 
with the observations than SCDM but between which it is difficult to discriminate using present observations of galaxy 
clustering and the cosmic microwave background; for a review, see Coles (1996). It is therefore important to try to 
find statistical diagnostics of clustering that may reveal differences between these models and the data to see if they 
do indeed explain the details of the observed clustering phenomenon, as well as between the models themselves so 
one can understand how the various extra ingredients involved in these models alter specific characteristics of the 
clustering pattern. 

Simple two-point statistical descriptions of superclustering (i.e. the clustering of galaxy clusters) have already 
yielded important clues about the shape of the matter power spectrum on large scales (e.g. Peacock & Dodds 1994; 
Borgani et al. 1997) and, more recently, this has been extended to simple properties of the higher-order moments 
(e.g. Plionis & Valdarnini 1995; Plionis et al. 1995; Borgani et al. 1995; Gaztanaga, Croft & Dalton 1995). However, 
the complete statistical characterisation of the clustering requires knowledge of all the higher order moments or, 
equivalently, knowledge of the complete set of n-point correlation functions (Peebles 1980). Such a description is 
extremely laborious to construct, tends to be swamped by discreteness effects and sampling errors even at quite small 
n and is in any case rather difficult to interpret geometrically. 

For these reasons it is useful to seek a description of clustering which bypasses this more orthodox approach and 
looks for intrinsically geometrical or topological signatures. One can hope that such approaches might lead to robust 
quantitative descriptions of the void-filament network which is visually apparent in the distribution of galaxies, and 
to relate this visual appearance to the interaction of non-linear gravitational dynamics on an initial density field 
with some assumed power spectrum. The hope is therefore to pick out differences between models which are hard 
to discern in measures such as the power spectrum. Various approaches to this question have been suggested and 
some of them have been more successful than others in their application to the data. One particular problem such 
descriptors face when they are applied to superclustering, for example, is that these objects are extremely rare and 
there are strong shot-noise effects which have to be compensated for in some way. 

In this paper, we aim to investigate a particular set of topological or geometrical descriptors of the pattern present 
in simulated cluster distributions and, where possible, to compare the results from simulations with the analogous 
results from the Abell/ACO cluster catalogue. We should stress at the outset that this is an exploratory work and 
there are reasons to suspect that the task of discriminating between these models and the data might be extremely 
difficult. First there is the problem of shot-noise we alluded to above. Secondly, the available cluster sample is quite 
small and may suffer from unknown selection effects. One can hope, however, that better controlled cluster samples 
may emerge fairly soon from ongoing galaxy redshift surveys. Third, it is extremely difficult to construct sufficiently 
large $N$-body simulations of galaxy clustering and select the appropriate clusters within them in the same way that 
clusters are selected observationally (e.g. Bahcall & Cen 1992; Croft & Efstathiou 1994; Eke et al. 1996). And finally, 
there is the ubiquitous problem of understanding how the objects one sees relate to the distribution of matter one 
calculates, a difficulty generically known by the name of biasing and which was first discussed in the context of rich 
clusters by Kaiser (1984). 

In the spirit of exploration, therefore, we shall use simplified models of superclustering, generated by using a 
method based on the Zel'dovich approximation. This method has been used in a number of previous studies of the 
distribution of clusters in both position and velocity space (Borgani, Coles & Moscardini 1994; Plionis et al. 1995; 
Borgani et al. 1995; Tini Brunozzi et al. 1995; Moscardini et al. 1996; Borgani et al. 1997) and is known to be accurate 
in comparison with the full $N$-body approach, provided the degree of non-linear evolution at the scale of individual 
clusters is not too strong. 

The outline of the paper is as follows. In Sections 2 and 3 we briefly describe our simulation method and the 
observed Abell/ACO cluster sample, respectively. We then go on to discuss the various clustering descriptors we use 
to analyse these data sets. First, in Section 4, we discuss the topological properties of the isodensity regions in the 
distribution traced by clusters, using a method described in detail by Coles, Davies & Pearson (1996) and which 
is similar (but not identical) to the well-known genus statistic (reviewed by Melott 1990) and which has recently 
been applied to cluster data by Rhoads, Gott & Postman (1994). We next, in Section 5, discuss an analysis based 
on percolation theory. The last of our three approaches, presented in Section 6, is based on properties of a graph- 
theoretical construction known as the minimal spanning tree, in conjunction with a set of mathematical quantities 
intended to describe the shapes of pieces of the trees obtained (Pearson & Coles 1995). Each of the three analyses we 
attempt is expected to perform better in some situations than others, so in Section 7 we present an analysis of the 
statistical power of these tests at discriminating between different models and between the models and the observed 
data. We also discuss the virtues of combining the various tests and show the statistical significance of the results we 
obtain by combining the different analyses into a composite test. We present our conclusions in Section 8. 



2 THE SIMULATIONS 

2.1 The Zel'dovich Approach 

The Zel'dovich approximation [ZA] (Zel'dovich 1970; Shandarin & Zel'dovich 1989) is based on the assumption of 
laminar flow for the motion of a self-gravitating non-relativistic collisionless fluid. Let $\mathbf{q}$ be the initial (Lagrangian) 
position of a fluid element and $\mathbf{r}(\mathbf{q}, t) = a(t)\,\mathbf{x}(\mathbf{q}, t)$ the final position at time $t$, which is related to the comoving 
Eulerian coordinate $\mathbf{x}(\mathbf{q}, t)$ through the cosmic expansion factor $a(t)$. The ZA amounts to assuming the expression 

$$\mathbf{r}(\mathbf{q},t) = a(t)\left[\mathbf{q} + b(t)\,\nabla_{\mathbf{q}}\psi(\mathbf{q})\right] \qquad (1)$$

for the Lagrangian-to-Eulerian coordinate mapping. In equation (1) $b(t)$ is the growing mode for the evolution of 
linear density perturbations and $\psi(\mathbf{q})$ is the gravitational potential, which is related to the initial density fluctuation 
field, $\delta(\mathbf{q})$, through the Poisson equation 

$$\nabla_{\mathbf{q}}^2\psi(\mathbf{q}) = -\frac{\delta(\mathbf{q})}{a(t)} . \qquad (2)$$

As a result of the factorization of the $t$- and $\mathbf{q}$-dependence in the displacement term of equation (1), the fluid particles 
move under this approximation along straight lines, with comoving peculiar velocity 

$$\mathbf{v}(\mathbf{q},t) = \dot{\mathbf{x}}(\mathbf{q},t) = \dot{b}(t)\,\nabla_{\mathbf{q}}\psi(\mathbf{q}) . \qquad (3)$$

Therefore, gravity determines the initial kick to the fluid particles through eqs. (2) and (3), and afterwards they do not 
feel any tidal interactions. Particles fall inside gravitational wells to form structures, which however quickly evaporate. 
In this sense, the ZA gives a good description of gravitational dynamics as long as particle trajectories do not intersect 
with each other, while its validity breaks down when shell-crossing occurs, and local gravity dominates. 
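
As an illustration of the mapping in equation (1), the following is a minimal sketch of a Zel'dovich displacement on a periodic grid, assuming NumPy. The function names, the use of FFTs to solve the Poisson equation and the choice to absorb the time-dependent factors $a(t)$ and $b(t)$ into a single `growth` amplitude are assumptions made for this example and are not taken from the simulation code used in the paper.

```python
import numpy as np

def zeldovich_displacement(delta, box_size):
    """Displacement field grad_q(psi) for a periodic density-contrast grid.

    Solves nabla^2 psi = -delta in Fourier space; time-dependent factors
    are assumed to be absorbed into the amplitude of `delta`.
    """
    n = delta.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                      # avoid dividing by zero at k = 0
    psi_k = np.fft.fftn(delta) / k2        # -k^2 psi_k = -delta_k
    psi_k[0, 0, 0] = 0.0                   # the mean mode carries no displacement
    return np.stack([np.real(np.fft.ifftn(1j * k * psi_k)) for k in (kx, ky, kz)])

def zeldovich_move(delta, box_size, growth=1.0):
    """Comoving positions x = q + growth * grad_q(psi), wrapped periodically."""
    n = delta.shape[0]
    grid = np.arange(n) * box_size / n
    q = np.stack(np.meshgrid(grid, grid, grid, indexing="ij"))
    return (q + growth * zeldovich_displacement(delta, box_size)) % box_size
```

For a single plane wave the displacement points towards the density maximum, so particles stream into overdense regions along straight lines, as described above.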

Coles, Melott & Shandarin (1993) have shown that filtering out the small-scale wavelength modes in the linear 
power-spectrum reduces the amount of shell-crossing, thus improving the performance of the ZA. Melott, Pellman 
& Shandarin (1993) claimed that an optimal filtering procedure is obtained by convolving the linear power-spectrum 
with the Gaussian filter 

$$W_G(kR_f) = e^{-(kR_f)^2/2} \qquad (4)$$

(cf. Sahni & Coles 1995). The problem then arises of choosing the filtering radius $R_f$ appropriately, in order to 
suppress shell-crossing as much as possible without preventing genuine clustering from building up. 

Kofman et al. (1994) derived an analytical expression - their equation (7) - for the average number of streams at 
each Eulerian point, $N_s$, as a function of the r.m.s. fluctuation level of the initial Gaussian density field. We decided 
to choose $R_f$ for each model so that $N_s = 1.1$. We found this to be a reasonable compromise between smaller $N_s$ 
values, giving rapidly increasing $R_f$ and high suppression of clustering, and larger $N_s$, at which the ZA progressively 
breaks down. The resulting r.m.s. fluctuation value corresponding to $N_s = 1.1$ is $\sigma = 0.88$. 
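
The choice of filtering radius can be illustrated numerically: given a power spectrum, one looks for the $R_f$ at which the r.m.s. fluctuation of the filtered field equals 0.88. The sketch below assumes NumPy, assumes that the Gaussian window multiplies the Fourier amplitudes (so that the spectrum is multiplied by $W_G^2$), and uses a purely hypothetical toy spectrum `toy_power`; none of these choices is taken from the paper, and the resulting radius has no physical meaning.

```python
import numpy as np

def sigma_filtered(power, r_f, k_min=1e-4, k_max=10.0, n_k=4000):
    """r.m.s. fluctuation of a field with spectrum `power`, Gaussian-filtered on r_f."""
    k = np.logspace(np.log10(k_min), np.log10(k_max), n_k)
    # Integrand of sigma^2 per unit ln k: P(k) W^2(k r_f) k^3 / (2 pi^2)
    y = power(k) * np.exp(-(k * r_f) ** 2) * k**3 / (2.0 * np.pi**2)
    lnk = np.log(k)
    return np.sqrt(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(lnk)))  # trapezoid rule

def find_filter_radius(power, sigma_target=0.88, r_lo=0.0, r_hi=100.0, tol=1e-6):
    """Bisection for r_f; sigma decreases monotonically as r_f grows."""
    if not sigma_filtered(power, r_hi) < sigma_target < sigma_filtered(power, r_lo):
        raise ValueError("target sigma is not bracketed by [r_lo, r_hi]")
    while r_hi - r_lo > tol:
        mid = 0.5 * (r_lo + r_hi)
        if sigma_filtered(power, mid) > sigma_target:
            r_lo = mid      # still too much power: filter harder
        else:
            r_hi = mid
    return 0.5 * (r_lo + r_hi)

def toy_power(k):
    """Hypothetical spectrum (illustration only): P ~ k at small k, ~ k^-2 at large k."""
    return 1.0e5 * k / (1.0 + (k / 0.05) ** 2) ** 1.5
```

Calling `find_filter_radius(toy_power)` returns the radius at which the toy field has $\sigma = 0.88$; a spectrum with more small-scale power requires a larger $R_f$, which is the trend seen in the filtering radii of the models considered here.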

By adopting this implementation of the ZA, the main steps of our cluster simulations are the following: 

(a) Convolve the linear power-spectrum with the Gaussian window of equation (4), with $R_f$ chosen as previously 
described. 

(b) Generate a random-phase realization of the density field on $128^3$ grid points for a cubic box of side 
$L = 320\,h^{-1}$ Mpc. 

(c) Move $128^3$ particles, initially located at the Lagrangian positions of the grid points, according to the ZA. Each 
particle carries a mass of $4.4 \times 10^{12}\,h^{-1}\,\Omega_0\,M_\odot$. 

(d) Reassign the density and the velocity field on the grid through a TSC interpolation scheme (e.g. Hockney & 
Eastwood 1981) for the mass and the momentum carried by each particle. 

(e) Select clusters as local density maxima on the grid according to the following prescription. If $d_{cl}$ is the average 
cluster separation, then we select $N_{cl} = (L/d_{cl})^3$ clusters as the $N_{cl}$ highest density peaks. In the following, we assume 
$d_{cl} = 40\,h^{-1}$ Mpc, which is appropriate for the combined Abell/ACO cluster sample to which we will compare our 
simulation results (see Section 3). Therefore, we will analyze a distribution of 512 clusters in each simulation box, 
with periodic boundary conditions. 
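
Step (e) can be sketched as follows, assuming NumPy. The definition of a local maximum used here (a cell strictly higher than all 26 periodic neighbours) and the function names are assumptions for the purpose of illustration; the text does not specify the neighbourhood used in the actual simulations.

```python
import itertools
import numpy as np

def n_clusters(box_size, d_cl):
    """Number of clusters N_cl = (L / d_cl)^3 for a mean separation d_cl."""
    return int(round((box_size / d_cl) ** 3))

def select_clusters(density, box_size, d_cl):
    """Return positions and heights of the N_cl highest local density maxima."""
    is_peak = np.ones(density.shape, dtype=bool)
    for shift in itertools.product((-1, 0, 1), repeat=3):
        if shift != (0, 0, 0):
            # Periodic comparison with one of the 26 neighbouring cells
            is_peak &= density > np.roll(density, shift, axis=(0, 1, 2))
    idx = np.argwhere(is_peak)
    heights = density[is_peak]
    order = np.argsort(heights)[::-1][: n_clusters(box_size, d_cl)]
    cell = box_size / density.shape[0]
    return idx[order] * cell, heights[order]
```

With `box_size = 320.0` and `d_cl = 40.0`, `n_clusters` gives the 512 clusters per box quoted in step (e); if the grid contains fewer peaks than this, the sketch simply returns all of them.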

2.2 Dark Matter Models 

We ran simulations for six different models of the initial fluctuation spectrum. For each model, we generate 5 random 
realizations, as a compromise between estimating the cosmic variance reliably and keeping the amount of data to be 
analysed within reasonable bounds. All the models, except OCDM, are normalized to be consistent with the COBE 
measured quadrupole of CMB temperature anisotropy (Bennett et al. 1994). Since we are primarily interested in 
these simply as tests of the method, we have not attempted to fine-tune the parameters of each scenario in order to 
maximise its performance: the models chosen are simply meant to represent the range of behaviours of contenders 
for a viable model of structure formation. The models we have considered are the following. 

(1) The standard CDM model (SCDM), with $\Omega_0 = 1$, $h = 0.5$ and $\sigma_8 = 1$ for the r.m.s. fluctuation amplitude 
within a top-hat sphere of $8\,h^{-1}$ Mpc. This model has already been excluded by independent analyses but we include it 
here for completeness and to see whether our pattern descriptors can also successfully reject it. 

(2) An open CDM model (OCDM), with $\Omega_0 = 0.2$ and $n = 1$. We have chosen to normalise this model to $\sigma_8 = 1$, 
so that our results in this paper can be compared with Plionis et al. (1995) and Borgani et al. (1995); more detailed 
discussion of this model can be found in Coles & Ellis (1994, 1997), Ratra & Peebles (1994, 1995), Liddle et al. (1996) and 
Yamamoto & Bunn (1996). 

(3) A tilted CDM model (TCDM), with n = 0.7 for the primordial spectral index. Tilting the primordial spectral 
shape from the scale-free one has been suggested in order to improve the CDM description of the large-scale structure 
(e.g. Cen et al. 1992; Tormen et al. 1993; Liddle & Lyth 1993; Adams et al. 1993; Moscardini et al. 1995). 

(4) A low Hubble constant CDM model (LOWH), with $h = 0.3$. Decreasing the Hubble constant has the effect of 
increasing the horizon size at the equivalence epoch, thus pushing the turnover of the spectrum to its scale-free form 
out to larger scales (cf. Bartlett et al. 1994). 

(5) A Cold + Hot DM model (CHDM), with $\Omega_{hot} = 0.3$ for the fractional density contributed by the hot particles. 
For a fixed large-scale normalization, adding a hot component has the effect of suppressing the power-spectrum 
amplitude at small wavelengths (e.g. Klypin et al. 1993). Although the small-scale peculiar velocities are lowered to 
an adequate level, the corresponding galaxy formation time is delayed so that such a model is strongly constrained 
by the detection of high-redshift objects (e.g. Ma & Bertschinger 1994; Klypin et al. 1995; Borgani et al. 1997). 

(6) A spatially flat, low-density CDM model ($\Lambda$CDM), with $\Omega_0 = 0.2$, $\Omega_\Lambda = 0.8$ for the cosmological constant term 
(e.g. Efstathiou, Sutherland & Maddox 1990; Bahcall & Cen 1992; Baugh & Efstathiou 1993; Kofman, Gnedin & 
Bahcall 1993; Peacock & Dodds 1994) and $\sigma_8 = 1.3$, so as to be consistent with the two-year COBE results. 

The transfer functions for the above models have been taken from Holtzman (1989), except that of LOWH, 
which is taken from Bond & Efstathiou (1984), with suitably chosen shape parameter $\Gamma = \Omega_0 h = 0.3$. We note 
that the latter transfer function assumes the baryonic component to be negligible, which is probably not accurate if 
nucleosynthesis is correct, but this would only affect the shape of the transfer function on small scales, below those 
we are interested in here. All the model parameters are listed in Table 1. 

Table 1. The models. Column 2: the density parameter $\Omega_0$; Column 3: the cosmological constant term $\Omega_\Lambda$; Column 4: the 
density parameter of the hot component $\Omega_{hot}$; Column 5: the primordial spectral index $n$; Column 6: the Hubble parameter 
$h$; Column 7: the linear r.m.s. fluctuation amplitude at $8\,h^{-1}$ Mpc, $\sigma_8$; Column 8: the filtering radius, $R_f$, in units of $h^{-1}$ Mpc, 
corresponding to $N_s = 1.1$ for the level of orbit crossing. 

Model          $\Omega_0$   $\Omega_\Lambda$   $\Omega_{hot}$   $n$    $h$    $\sigma_8$   $R_f$
SCDM           1.0          0.0                0.0              1.0    0.5    1.0          4.4
OCDM           0.2          0.0                0.0              1.0    1.0    1.0          4.5
TCDM           1.0          0.0                0.0              0.7    0.5    0.5          1.6
LOWH           1.0          0.0                0.0              1.0    0.3    0.6          2.4
CHDM           1.0          0.0                0.3              1.0    0.5    0.7          2.2
$\Lambda$CDM   0.2          0.8                0.0              1.0    1.0    1.3          6.3

It is worth making a specific point about the $\Lambda$CDM model we use here. Strictly speaking, the amplitude of matter 
fluctuations required for this model to be compatible with COBE is larger than can be treated with great accuracy 
by our simulation method (Borgani et al. 1995). However, we found in the course of this analysis that this feature of 
$\Lambda$CDM reveals some interesting properties of the topological descriptors we use so, rather than using an alternative 
model with a lower normalisation (cf. Borgani et al. 1995), we will keep the model with a higher normalisation in 
this analysis. In any case, as we shall show, changing the normalisation from $\sigma_8 = 1.3$ to, say, $\sigma_8 = 0.8$ does not 
significantly change the large-scale topological properties of selected clusters. 



3 THE CLUSTER SAMPLE 

We use the combined Abell/ACO $R \geq 0$ cluster sample, as defined in Plionis & Valdarnini (1995) and in Borgani et 
al. (1995). The northern (Abell) sample is limited to declinations dec $> -17^\circ$, with the southern (ACO; Abell, Corwin & 
Olowin 1989) sample lying below this limit, while both samples are limited in Galactic latitude by $|b| > 30^\circ$. 

To take into account the effect of Galactic absorption, we assume the usual cosecant law: 

$$P(|b|) = \mathrm{dex}\left[\alpha\,(1 - \csc|b|)\right] \qquad (5)$$

with $\alpha \approx 0.3$ for the Abell sample (Bahcall & Soneira 1983; Postman et al. 1989) and $\alpha \approx 0.2$ for the ACO sample 
(Batuski et al. 1989). The cluster-redshift selection function, $P(z)$, is determined in the usual way (cf. Postman et al. 
1989), by fitting the cluster density as a function of $z$. Cluster distances are estimated using the standard relation: 



$$R = \frac{c}{H_0\,q_0^2\,(1+z)}\left[q_0 z + (1-q_0)\left(1-\sqrt{2 q_0 z + 1}\right)\right], \qquad (6)$$

with $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$ and $q_0 = \Omega_0/2$. Strictly speaking, equation (6) holds only for the case of a 
vanishing cosmological constant. Therefore, for a consistent comparison with the simulation models, we should use 
different $R$-$z$ relations for the Abell/ACO analysis. However, we verified that final results are essentially independent 
of the choice of the ($\Lambda$, $\Omega_0$) parameters used in the simulations. For this reason, in the following we will present results 
for real data only based on assuming equation (6) with $q_0 = 0.2$. 
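
The two relations used in this section, the cosecant absorption law of equation (5) and the distance-redshift relation of equation (6), can be written as a short sketch using only the Python standard library. The function names and the default arguments are assumptions made for illustration; distances are returned in $h^{-1}$ Mpc when `h = 1`.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def galactic_absorption(b_deg, alpha):
    """Cosecant law P(|b|) = dex[alpha * (1 - csc|b|)], with b in degrees."""
    csc_b = 1.0 / math.sin(math.radians(abs(b_deg)))
    return 10.0 ** (alpha * (1.0 - csc_b))

def cluster_distance(z, q0=0.2, h=1.0):
    """Distance R(z) for a vanishing cosmological constant, q0 = Omega_0 / 2."""
    hubble = 100.0 * h  # km/s/Mpc
    bracket = q0 * z + (1.0 - q0) * (1.0 - math.sqrt(2.0 * q0 * z + 1.0))
    return C_KM_S / (hubble * q0**2 * (1.0 + z)) * bracket
```

At the Galactic pole the absorption factor is unity by construction, while at $|b| = 30^\circ$ with $\alpha = 0.3$ it falls to $10^{-0.3} \approx 0.5$. At low redshift the distance reduces to the Hubble law $R \approx cz/H_0$ with a correction of a few per cent.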

Note that due to the size of our simulations ($L = 320\,h^{-1}$ Mpc) we will restrict our analysis to a sphere of 
radius $160\,h^{-1}$ Mpc. Within this volume our Abell/ACO cluster sample is complete ($P(z) \approx 1$), containing clusters 
all of which have measured redshifts. The Abell and ACO cluster number densities, corrected for Galactic absorption 
according to equation (5) and within the present sample limits, are $\approx 1.7 \times 10^{-5}\,h^3$ Mpc$^{-3}$ and $\approx 2.3 \times 10^{-5}\,h^3$ 
Mpc$^{-3}$ respectively. The higher space-density of ACO clusters is partly due to the unique Shapley concentration 
(Shapley 1930; Scaramella et al. 1989), but a part is also due to systematic density differences between the Abell 
and ACO cluster samples which have been noted in a number of studies (cf. Plionis & Valdarnini 1991 and references 
therein) and which could be attributed to the high sensitivity of the IIIa-J emulsion plates. In fact, excluding the 
small $b > 30^\circ$ region of the ACO sample where the Shapley concentration lies (corresponding to a solid angle of 
$\delta\Omega \approx 0.08\pi$) significantly lowers the ACO cluster density ($\approx 1.9 \times 10^{-5}\,h^3$ Mpc$^{-3}$). Therefore the mean Abell/ACO 
cluster separation of our sample is $d_{cl} \approx 38$-$40\,h^{-1}$ Mpc. 
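
The step from number density to mean separation is simply $d_{cl} = n^{-1/3}$. A minimal sketch in plain Python follows; the function name is hypothetical, and the densities are assumed to be of order $10^{-5}\,h^3$ Mpc$^{-3}$, the order of magnitude implied by a mean separation of about $40\,h^{-1}$ Mpc.

```python
def mean_separation(number_density):
    """Mean inter-cluster separation d = n^(-1/3), in h^-1 Mpc for n in h^3 Mpc^-3."""
    return number_density ** (-1.0 / 3.0)

# Assumed densities in units of h^3 Mpc^-3
for label, n in [("Abell", 1.7e-5), ("ACO", 2.3e-5), ("ACO without Shapley region", 1.9e-5)]:
    print(f"{label}: d = {mean_separation(n):.1f} h^-1 Mpc")
```

Under these assumptions the Abell density corresponds to a separation of about 39 $h^{-1}$ Mpc and the ACO density without the Shapley region to about 37.5 $h^{-1}$ Mpc, close to the value of 40 $h^{-1}$ Mpc adopted for the simulations.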

In the following, we compare results based on the Abell/AGO sample with those derived from our simulated 
cluster populations, selected to have a similar number density, sample volume and sky coverage. We have verified 
that variations in $d_{cl}$ of the order of the Abell-ACO difference do not significantly affect the resulting statistical 
properties. 

It is important to stress that there is a possibility that these catalogues may be contaminated by selection effects, 
perhaps due to the line-of-sight projection effects (e.g. Sutherland 1988; but see Jing, Plionis & Valdarnini 1992). The 
effect of such errors may be less pronounced on the 'morphological' measures of clustering we employ in this paper 
than on the quantities such as the two-point correlation function used in previous work. Nevertheless, the uncertain 
reliability of the catalogue, together with its relatively small size, requires us to be circumspect when presenting our 
conclusions. 



4 TOPOLOGY 

In this section we explore the behaviour of a topological characteristic of the large-scale distribution of clusters, called 
the Euler-Poincare characteristic, for the different simulated data sets. For general background material on topology, 
see Adler (1981) and Nash & Sen (1983). 



4.1 Theory 

One of the commonly-used quantitative measures of clustering pattern used in cosmology is the so-called genus 
statistic, described in detail in Melott (1990) who gives the genus g of a solid object as 

$$g = (\mathrm{no.\ holes}) - (\mathrm{no.\ isolated\ regions}) + 1 . \qquad (7)$$

This characteristic is generally applied to the observational data by first smoothing them to form a continuous density 
field, $\delta$, and then locating the regions where the smoothed field exceeds a given threshold density. Isodensity surfaces 
thus define solid three-dimensional objects whose topology can be defined in terms of the genus. One typically labels 
the threshold density in the dimensionless form, $\nu$, defined as the number of standard deviations of $\delta$ above the mean: 
$\delta = \nu\sigma$ (the mean value of $\delta$ is zero by construction). One of the great advantages of the characteristic $g$ is that, for 
a Gaussian density field in three dimensions, its mean value per unit volume, $g_s$, as a function of $\nu$ can be obtained 
in a simple closed form: 

$$g_s = A\,(1-\nu^2)\exp(-\nu^2/2) \qquad (8)$$

(Doroshkevich 1970; Adler 1981; Bardeen et al. 1986; Hamilton et al. 1986); the constant $A$ depends only on the first 
and second moments of the power spectrum of $\delta$ and can be expressed in terms of the coherence length of the random 
field, $\lambda_c$: 

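$$A = \frac{1}{4\pi^2\lambda_c^3}, \qquad (9)$$

where 

$$\lambda_c^2 = \frac{3\,\sigma_0^2}{\sigma_1^2} \qquad (10)$$

and 

$$\sigma_j^2 = \frac{1}{2\pi^2}\int_0^\infty P(k)\,k^{2(j+1)}\,dk . \qquad (11)$$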

The dependence (8) means that all Gaussian fields produce the same shape curve for $g_s(\nu)$ and that the amplitude 
can, in principle at least, be used to determine properties of the power spectrum, $P(k)$, relatively directly from the 
data. Note that the coherence length will be determined both by the shape of the transfer function of the model in 
question and by the scale of smoothing adopted to produce the continuous density field; see below. 

The genus curve for Gaussian fields is symmetric about the mean and positive for $|\nu| < 1$, indicating that 
threshold values around the mean give rise to contour surfaces which are multiply connected. This is the characteristic 
'sponge' topology in which high density and low density regions interlock. Non-Gaussian alternatives would be a 
'meatball' topology in which isolated high density regions sit in a low-density background and the mirror-image of 
this, a 'swiss-cheese' topology. 
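
The shape of the Gaussian genus curve of equation (8) is easy to evaluate directly. The sketch below assumes NumPy and treats the amplitude $A$ as a free parameter (set to unity by default), since only the shape is needed for the discussion above; the function name is hypothetical.

```python
import numpy as np

def gaussian_genus_curve(nu, amplitude=1.0):
    """Mean genus per unit volume of a Gaussian field: A (1 - nu^2) exp(-nu^2 / 2)."""
    nu = np.asarray(nu, dtype=float)
    return amplitude * (1.0 - nu**2) * np.exp(-0.5 * nu**2)

nu = np.linspace(-3.0, 3.0, 13)
for threshold, value in zip(nu, gaussian_genus_curve(nu)):
    print(f"nu = {threshold:+.1f}   g_s / A = {value:+.3f}")
```

The curve is even in $\nu$, peaks at $\nu = 0$, changes sign at $\nu = \pm 1$ and reaches its most negative values at $\nu = \pm\sqrt{3}$, where isolated clusters (or isolated voids) dominate the excursion set. A 'meatball' shift shows up as a displacement of this curve towards negative $\nu$.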

The quantity $g_s$ is usually measured in practice by invoking the Gauss-Bonnet theorem to relate it to the 
integrated curvature of the contour surfaces; the algorithm CONTOUR3D is the standard tool for performing this 
calculation on smoothed observational or simulated data sets (Gott, Melott & Dickinson 1986; Hamilton, Gott & 
Weinberg 1986; Melott, Weinberg & Gott 1988; Gott et al. 1989; Melott 1990; Moore et al. 1992; Vogeley et al. 1994). 
In a more recent paper, however, Coles et al. (1996) have shown that a much simpler and more efficient topology-
measuring algorithm can be developed from ideas presented by Adler (1981). This algorithm basically computes an 
approximation to the Euler-Poincare Characteristic (EPC) $\chi$ of data defined on a grid or lattice in three dimensions. 
If the genus is defined according to equation (7) then $g = -\chi/2$ so that the curves of $\chi(\nu)$ and $g(\nu)$ are of the same 
shape except for a sign. We shall concentrate on $\chi$ from now on. The algorithm we use is explained in more detail 
in Coles et al. (1996), but basically one constructs a three-dimensional framework of points, lines, squares and cubes 
linking neighbouring points above the threshold density contrast. If there are $P$ points, $L$ lines, $S$ squares and $C$ 
cubes then 

$$\chi = P - [L + C] + S . \qquad (12)$$

Points are counted whether or not they belong to lines, squares or cubes; lines are counted whether or not they 
belong to squares or cubes; squares are counted whether or not they form part of cubes. This calculation is a simple 
generalisation of the 2D equivalent which counts only points, lines and squares: the 2D version has been explored in 
detail in (Coles 1988; Coles & Plionis 1991; Plionis, Valdarnini & Coles 1992; Davies & Coles 1993; Coles et al. 1993). 
Alternative algorithms are also discussed in Coles et al. (1996). 
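
The counting rule of equation (12) can be sketched on a boolean grid of above-threshold cells as follows, assuming NumPy. This is an illustration of the counting of points, lines, squares and cubes only; the function names and the simple `periodic` switch are assumptions, and the sketch does not reproduce the boundary corrections of Coles et al. (1996). The periodic option assumes at least three cells along each axis.

```python
import numpy as np

def _pair(a, axis, periodic):
    """AND each cell with its +1 neighbour along `axis`."""
    if periodic:
        return a & np.roll(a, -1, axis=axis)
    lo = [slice(None)] * a.ndim
    hi = [slice(None)] * a.ndim
    lo[axis], hi[axis] = slice(None, -1), slice(1, None)
    return a[tuple(lo)] & a[tuple(hi)]

def euler_poincare(mask, periodic=False):
    """chi = P - (L + C) + S for the cells of a 3D grid lying above threshold."""
    m = np.asarray(mask, dtype=bool)
    lx, ly, lz = (_pair(m, axis, periodic) for axis in range(3))
    sxy = _pair(lx, 1, periodic)     # unit squares in the x-y plane
    sxz = _pair(lx, 2, periodic)     # unit squares in the x-z plane
    syz = _pair(ly, 2, periodic)     # unit squares in the y-z plane
    cubes = _pair(sxy, 2, periodic)  # unit cubes with all eight corners set
    n_points = int(m.sum())
    n_lines = int(lx.sum() + ly.sum() + lz.sum())
    n_squares = int(sxy.sum() + sxz.sum() + syz.sum())
    n_cubes = int(cubes.sum())
    return n_points - (n_lines + n_cubes) + n_squares
```

As a check, a single isolated blob gives $\chi = 1$, a closed ring of cells gives $\chi = 0$ and a hollow shell gives $\chi = 2$, so isolated regions raise $\chi$ while tunnels lower it.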

In order to define the excursion sets appropriately one needs to smooth the initial point set with some kind 
of local averaging procedure. Clearly the smoothing radius adopted must be greater than, or of the order of the 
mean distance between points otherwise a continuous field is not created. To implement our algorithm as described 
above we also need to grid the data on a regular cubic lattice. The choices of grid resolution and smoothing scale 
are user-defined quantities and must be chosen in a pragmatic fashion. For example, the coherence of the density 
field should not be too large compared with the sample volume, otherwise edge effects dominate. A correction for 
edge effects is straightforward for periodic boundaries, such as in our simulations but is less reliable if there is a 
complicated boundary. One also wants the gridding to be fine enough that each piece of the excursion set is sampled 
by a sufficient number of cell points and that the ratio of the total number of points in the sample volume to the 
number on the edges is large. The smoothing scale adopted also depends on the number density of points selected: 
for richer clusters we need a longer smoothing length, and this may also affect the optimal choice of gridding. We 
discuss our final choices below. 
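
The smoothing and gridding step can be illustrated with a short sketch, assuming NumPy, a periodic box, nearest-grid-point assignment and a Gaussian filter written in Fourier space as $\exp(-k^2 R^2/2)$. These choices, and the function name, are assumptions for the example; other conventions for the Gaussian smoothing length exist in the literature and the text does not state which one is used.

```python
import numpy as np

def smoothed_density_contrast(positions, box_size, n_grid, r_smooth):
    """Grid a periodic point set and Gaussian-smooth it; returns delta on the grid."""
    cell = box_size / n_grid
    idx = np.floor(np.asarray(positions) / cell).astype(int) % n_grid
    counts = np.zeros((n_grid, n_grid, n_grid))
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)  # nearest grid point
    k1d = 2.0 * np.pi * np.fft.fftfreq(n_grid, d=cell)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    window = np.exp(-0.5 * (kx**2 + ky**2 + kz**2) * r_smooth**2)
    smooth = np.real(np.fft.ifftn(np.fft.fftn(counts) * window))
    return smooth / smooth.mean() - 1.0
```

The threshold in units of the standard deviation then follows as `nu = delta / delta.std()`, and the excursion set at a given level is `nu > nu_threshold`. For 512 points in a box of side 320 $h^{-1}$ Mpc, a smoothing radius well below the mean separation of 40 $h^{-1}$ Mpc leaves isolated spikes rather than a continuous field, which is the reason for the requirement stated above.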

We finally remark that we prefer to plot the behaviour of $\chi(\nu)$ as a function of $\nu$ as defined above in terms of the 
standard deviation of the density fluctuations. Other authors (e.g. Melott 1990) prefer to plot a different version of 
these curves which uses the volume fraction above the threshold to calculate the effective value v would have for 
the same volume fraction of a Gaussian random field (using the error function). Any dependence of the results on 
the one-point distribution of the fluctuations is transformed away in this latter definition, so it has the advantage of 
removing any effect of a monotonic local bias (e.g. Kaiser 1984; Coles 1993) on initially Gaussian fluctuations: the 
volume fraction remains the same in such a transformation, since excursion sets in the unbiased field are mapped into 
the same sets in the biased field. The justification for this is that one might be able to recover the topology of the 
initial density field from that of a set of locally-biased mass tracers by exploiting this property. On the other hand, 
this definition may conceal information about the form of the bias if it is non-local or non-monotonic. In the case we 
are interested in, the clusters are defined as peaks of a non-linear and therefore non-Gaussian density field and it is 
not clear what the effect of mapping back onto a Gaussian distribution will have. We therefore feel that it is better 
not to attempt to remove one-point information, as this may yield important clues about the biasing of clusters of 
different density relative to the mass distribution. One expects, for example, that clusters with higher density would 
be more biased than those of lower density and would therefore have a curve with a stronger apparent meatball shift. 
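
The alternative, volume-fraction definition of $\nu$ discussed above can be made concrete with a small sketch, assuming NumPy and the standard-library `statistics.NormalDist`. The function name is hypothetical. The effective threshold is the value of $\nu$ at which a Gaussian field would have the same fraction of its volume above threshold.

```python
from statistics import NormalDist

import numpy as np

def nu_from_volume_fraction(field, threshold):
    """Effective nu: Gaussian threshold with the same volume fraction above it."""
    fraction = float(np.mean(np.asarray(field) > threshold))
    if not 0.0 < fraction < 1.0:
        raise ValueError("threshold must leave some cells above and some below")
    # For a Gaussian field the fraction above nu*sigma is 1 - Phi(nu)
    return NormalDist().inv_cdf(1.0 - fraction)
```

Because only the ranking of cells enters, any monotonic local transformation of the field (for example `np.exp(field)` with the threshold transformed in the same way) leaves the effective $\nu$ unchanged. This is the sense in which the volume-fraction definition removes one-point information, whereas the definition in terms of the standard deviation adopted in this paper retains it.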

For reference, we should point out that related approaches to the analysis of superclustering have been 
implemented recently. Rhoads, Gott & Postman (1994) have used the more standard 'genus' algorithm to study the 
topological properties of contour surfaces constructed from the cluster distribution, including the different choice of v 
we described above. On the other hand, Kerscher et al. (1997) have used a different mathematical approach, based on 
the so-called Minkowski functionals, which incorporates as one of the descriptors a quantity analogous to the genus; 
for a further development of these ideas, see Schmalzing & Buchert (1997). These analyses are to some extent similar 
in spirit to that which we present here, but there are significant differences in both philosophy and implementation. 

4.2 Analysis 

In this section, we discuss only the comparison of our models with each other and leave the detailed analysis of errors, 
confidence intervals and comparisons with the data until Section 6. 

We performed a series of calculations of the EPC for the simulations with the standard correction for periodic 
boundary conditions (Coles et al. 1996). To get a feel for the effect of cluster selection on the strength of the meatball 
distortion produced we have considered the distribution of all the density peaks identified on the grid, as well as the 
distribution of the highest peaks, selected so as to produce clusters with a mean spacing of $40\,h^{-1}$ Mpc, comparable 
to the Abell/ACO catalogue. 

Figure 1 

Results are shown in Figure 1 for the EPC $\chi(\nu)$. We have chosen a Gaussian smoothing radius of $10\,h^{-1}$ 
Mpc, and have binned the smoothed field onto a $32^3$ grid. The curves shown are an average over 5 realizations of the 
model concerned, with error bars representing the standard deviation over this ensemble. Notice the slight asymmetry 
compared to the expected behaviour for a Gaussian field and the apparently anomalous shape of the $\Lambda$CDM curve, which 
appears to be very different from those of the other models. One should be suspicious that this difference might be due to the 
fact that this model is more highly evolved (i.e. it has a higher value of $\sigma_8$) than the others and that the strange topological 
behaviour is simply due to the fact that our Zel'dovich simulation method is behaving badly for this model. In fact, 
this is not the case. We applied the same test to a less evolved $\Lambda$CDM model ($\sigma_8 = 0.8$; cf. Borgani et al. 1995), which 
was demonstrated to be evolved quite accurately by our Zel'dovich technique, but found a graph of the EPC which 
matched the more evolved model closely in shape. It seems therefore that the more pronounced behaviour of the 
$\Lambda$CDM model is actually connected with intrinsic properties of the model, rather than with any limitation of our simulation 
method. 
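The gridding, smoothing and ensemble-averaging steps mentioned above can be sketched as follows. This is only an illustration under stated assumptions: nearest-grid-point assignment and an FFT-based Gaussian filter on a periodic box are our choices for the example (the text does not specify the assignment scheme), and the function names are hypothetical.

```python
import numpy as np

def grid_points(points, box, n):
    """Count points in the cells of an n^3 periodic mesh (nearest grid point)."""
    idx = np.floor(np.asarray(points) / box * n).astype(int) % n
    grid = np.zeros((n, n, n))
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return grid

def gaussian_smooth(grid, box, radius):
    """Smooth a periodic grid with a Gaussian window exp(-k^2 R^2 / 2) in Fourier space."""
    n = grid.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    window = np.exp(-0.5 * (kx**2 + ky**2 + kz**2) * radius**2)
    return np.fft.ifftn(np.fft.fftn(grid) * window).real

def ensemble_mean_std(curves):
    """Mean and sample standard deviation over realizations (rows) of a statistic."""
    curves = np.asarray(curves, dtype=float)
    return curves.mean(axis=0), curves.std(axis=0, ddof=1)

# Example with the box size, mesh and smoothing radius quoted in the text
# (lengths in h^-1 Mpc); the point positions here are random placeholders.
rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 320.0, size=(500, 3))
smoothed = gaussian_smooth(grid_points(pts, 320.0, 32), 320.0, 10.0)
```

Because the window equals unity at zero wavenumber, the smoothing preserves the total count on the grid.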

Figure 2 

Figure 2 shows analogous results to Figure 1, but for clusters selected to resemble Abell/ACO clusters; we have
chosen a Gaussian smoothing radius of $30\,h^{-1}$ Mpc, and have binned the smoothed field onto a $32^3$ grid. Notice the
slight asymmetry compared to the expected behaviour for a Gaussian field. The curves shown are again an average
over 5 realizations. Note the reduction in amplitude, due to the increase in smoothing length resulting in an increased
coherence length, and a more drastic meatball effect as a consequence of the bias: the point of minimum $\chi$ is moved
further to the left than in Figure 1. Notice also the substantially greater noise, even after averaging over 5 simulations.

It is important also to note the similarity of the curves for different models in Figure 2 compared to the clear
systematic variations of the curves with model in Figure 1. This shows that the dominant effect on the behaviour of
the EPC in the cluster-selected samples is that of thresholding rather than differences in the amount of evolution
or in the shape of the initial power spectrum. In particular, notice that the shape of the EPC curve for ΛCDM,
although anomalous in Figure 1, is consistent with the other models when only selected clusters are used. There are
nevertheless residual differences due to these other factors and, as we shall show later, they do allow discrimination
between the models with some degree of statistical confidence.

5 PERCOLATION 
5.1 Theory 

The use of percolation methods (e.g. Stauffer & Aharony 1992; Isichenko 1992), which have been borrowed by
cosmologists from condensed matter physics, to study aspects of galaxy clustering dates back to Zel'dovich (1982)
and Shandarin (1983), and for quantifying observed properties of galaxy clustering to Einasto et al. (1984); see also
Zel'dovich, Einasto & Shandarin (1982). Initially, the method used was based on the idea of 'decorating' each point
(galaxy or cluster) in a point set with a sphere of some radius and determining the point at which these spheres
overlap to 'percolate' the entire set. Suppose we have $N$ objects in a cubic sample of side $L$. The mean object-object
separation is defined to be $l = L N^{-1/3}$. One (notionally) draws a sphere of diameter $d = bl$, where $b$ is dimensionless,
around each point and determines $L_p(b)$, the maximum distance that can be traversed while still remaining within
such spheres. The spheres around neighbouring points may, of course, overlap with each other. When $L_p(b) = L$
the set is said to percolate and the critical value $b = b^*$ is called the percolation parameter. For a uniform set of
points on a Cartesian lattice, $b^* = 1$, while if the points are distributed along lines or sheets, $b^* < 1$. This simple
indicator of clustering, which uses only $b^*$ to quantify the connectivity of structures, has not been altogether successful
(Bhavsar & Barrow 1983; Dekel & West 1985; Dominik & Shandarin 1992), partly due to sensitivity to sample selection
parameters but mainly due to the inadequacy of encoding in one numerical quantity (i.e. $b^*$) all the properties of the
transition from the unpercolated distribution to a percolated one.
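The sphere-based definition above can be illustrated with a short sketch. It is an approximation under stated assumptions, not a reproduction of any published code: spheres are linked when their centres lie within one diameter of each other, $L_p(b)$ is estimated as the largest axis-aligned extent of a linked group plus one diameter, periodic wrapping is ignored, and the bisection assumes that the set percolates at the upper bracket. The function names are hypothetical.

```python
import numpy as np

def percolation_length(points, box, b):
    """Approximate L_p(b): largest axis-aligned extent of a group of overlapping spheres."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = b * box * n ** (-1.0 / 3.0)          # sphere diameter d = b * l
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path compression
            i = parent[i]
        return i

    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    for i, j in zip(*np.nonzero(np.triu(dist <= d, k=1))):
        parent[find(i)] = find(j)            # spheres i and j overlap or touch

    roots = np.array([find(i) for i in range(n)])
    best = 0.0
    for r in np.unique(roots):
        group = pts[roots == r]
        best = max(best, (group.max(axis=0) - group.min(axis=0)).max() + d)
    return min(best, box)

def critical_b(points, box, lo=0.0, hi=3.0, tol=1e-3):
    """Bisect for the smallest b with L_p(b) = L (assumes percolation at b = hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if percolation_length(points, box, mid) >= box:
            hi = mid
        else:
            lo = mid
    return hi
```

For points on a regular cubic lattice this estimate returns a critical value close to unity, while points strung along a single line percolate at a value well below unity, in line with the behaviour described in the text.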

To remedy this shortcoming, and to keep the analysis as comparable as possible to that of the preceding Section
on the EPC, we adopt a more sophisticated approach, as described by Klypin & Shandarin (1993); see also Mo &
Börner (1990), de Lapparent et al. (1991), Yess & Shandarin (1996), Sathyaprakash, Sahni & Shandarin (1996) and
Sahni, Sathyaprakash & Shandarin (1997). In this approach, we consider the application of percolation techniques
to a cubic lattice on which, according to some density threshold criterion, cells are labelled as either 'filled' or 'empty'.
Once the cells are so labelled, clusters of cells are identified. A cluster is defined as a connected neighbourhood of
cells, where cells are connected if they either share a common side (not an edge or corner) or have a neighbour which
is connected according to the previous criterion (i.e. 'friends-of-friends'). From this definition of clusters (actually, in
this context, it would be more accurate to call them 'superclusters'), we define the size of a cluster to be the number
of cells in the cluster. A cluster is called infinite if it connects antipodal sides of the cubic lattice, and the critical
threshold level $p_c$ is the threshold level at which the first infinite cluster is formed. At $p_c$ the system can be thought
of as undergoing a kind of phase transition, from an unconnected to a connected state. If the cubic lattice has side
$L$ then the filling factor of the lattice is defined to be simply the fraction of cells labelled as 'filled'; this yields the
probability $p$ of a randomly-selected cell being filled. We then define the multiplicity function, $n(\eta)$, to be the average
number density of clusters of size $\eta$. From this a cell can be thought of as being in one of three states: empty (with
probability $1 - p$); member of a finite (non-spanning) cluster with probability

$$p_+ = \sum_{\eta} \eta\, n(\eta); \qquad (13)$$

or part of an infinite cluster with probability

$$p_\infty = \eta_{\max}/L^3. \qquad (14)$$

We then employ two sample statistics derived from these considerations. First is the fraction of cells belonging to
infinite clusters in a given sample, which is an estimator of $p_\infty$ and which we shall call $\mu_\infty$. We define the second
parameter, $\mu^2$, to be the (weighted) mean square size of all clusters excluding the largest one:

$$\mu^2 \propto \frac{\sum_{\eta} \eta^2\, n(\eta)}{\sum_{\eta} \eta\, n(\eta)}, \qquad (15)$$

where the sums exclude the largest cluster; the normalising factor, a power of $L$, serves simply to scale results for
different $L$ onto each other (Klypin & Shandarin 1993).
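A minimal sketch of how such lattice statistics could be measured is given below. It is illustrative only: cells are joined when they share a face, periodic wrapping is ignored, the largest cluster is used as the candidate infinite cluster, and the second statistic is computed as the ratio of the sum of squared sizes to the sum of sizes over all other clusters, with the scaling by a power of $L$ omitted. The function names and the returned dictionary keys are hypothetical.

```python
import numpy as np
from collections import deque

STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def label_clusters(filled):
    """Label face-connected clusters of filled cells by breadth-first search."""
    filled = np.asarray(filled, dtype=bool)
    labels = np.zeros(filled.shape, dtype=int)
    current = 0
    for start in zip(*np.nonzero(filled)):
        if labels[start]:
            continue
        current += 1
        labels[start] = current
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for step in STEPS:
                nb = tuple(c + s for c, s in zip(cell, step))
                inside = all(0 <= x < n for x, n in zip(nb, filled.shape))
                if inside and filled[nb] and not labels[nb]:
                    labels[nb] = current
                    queue.append(nb)
    return labels, current

def percolation_statistics(filled):
    """Filling factor, largest-cluster fraction, weighted size of the rest, spanning flag."""
    filled = np.asarray(filled, dtype=bool)
    labels, count = label_clusters(filled)
    if count == 0:
        return {"p": 0.0, "mu_inf": 0.0, "mu2": 0.0, "spans": False}
    sizes = np.bincount(labels.ravel())[1:]            # cells per cluster
    largest = int(np.argmax(sizes)) + 1                # label of the largest cluster
    cells = np.nonzero(labels == largest)
    # The largest cluster spans if it touches two opposite faces of the lattice.
    spans = any(c.min() == 0 and c.max() == n - 1 for c, n in zip(cells, filled.shape))
    rest = np.delete(sizes, largest - 1).astype(float)
    mu2 = float((rest ** 2).sum() / rest.sum()) if rest.size else 0.0
    return {"p": float(filled.mean()), "mu_inf": float(sizes.max() / filled.size),
            "mu2": mu2, "spans": bool(spans)}
```

Applying `percolation_statistics` to the region above a sequence of density thresholds traces out curves of the kind discussed in the following subsection.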



5.2 Analysis 

For the percolation analysis of the cluster simulations we have followed a similar approach to that of the EPC. That
is to say, we have analysed samples containing both "all" clusters and those selected according to the Abell/ACO
number-density criteria. We have analysed the simulations according to a variety of different grid resolutions but, for
conciseness and to show only the most pertinent trends, we describe only those results for the same smoothing and
gridding onto a cubic mesh as in Section 4; as before we define the density threshold $\rho_c$ in terms of the number of
standard deviations from the mean density.

Figures 3 & 4. 

Figure 3 shows the behaviour of $\mu_\infty$ for all the clusters in each model, while Figure 4 shows the same plot for
the Abell/ACO selected samples. Notice that the fraction of points in the infinite cluster is generally unity for low
thresholds and zero for high thresholds independent of the model, as expected. The transition from one limit to the
other, however, takes place at different threshold levels for the different models: it begins at around $\nu = -2$ for the
'all data' samples and at around $\nu = -1$ for the selected clusters; the transition is more rapid in the latter case. This
must again be mainly influenced by the shape of the one-point distribution of the density fields in the two cases
rather than differences in the amount of evolution or in the shape of the primordial fluctuation spectrum. As was
the case for the EPC curves, it is not obvious to the uneducated eye whether the curves, which are averages over 5
simulations, reveal significant departures from one model to another: we shall discuss this in more detail later.

Figure 5 

Now we turn to the behaviour of $\mu^2$ displayed in Figure 5. For brevity we show results only for the Abell/ACO type
clusters, using a $32^3$ grid as before. Notice that the largest mean size of cluster is formed at thresholds between $\nu = -1$
and $\nu = 0$, which is where the topology also indicates the highest degree of multiple connectivity. There is an apparent
'glitch' in the case of the LOWH model displayed in the right-hand panel, but this is simply due to the fact that
there are two effectively infinite clusters in this case: the pattern percolates in two directions at low thresholds and
only one of the infinite clusters is removed in the averaging procedure. Other than this one feature (which occurs
with a small but non-negligible probability) the distributions are visually similar for all models.

At this stage we simply remark that differences between the models appear to be less pronounced in terms of
these two statistics than they do when we look at the topological EPC descriptor. This conclusion is not altered if one
uses different gridding parameters; one simply probes the connectivity on a different scale. A more detailed analysis
of the discrimination achieved between these models using these descriptors is deferred until Section 7.

6 MINIMAL SPANNING TREES AND STRUCTURE FUNCTIONS 

In this section we describe the results of applying a test proposed by Pearson & Coles (1995) to these simulations. 
This test was originally suggested as a means to quantify the shape (i.e. filament or sheet-like geometry) of galaxy
clustering, rather than its topology (i.e. connectivity), which was the subject of the previous two descriptors. Rather
than smoothing the data, one constructs the Minimal Spanning Tree (MST) of the point set and then quantifies
the shape of pieces of the tree using a set of three shape parameters suggested by Babul & Starkman (1992); see 
also Luo & Vishniac (1995) and Dave et al. (1997). We should say at the outset that the shape of superclustering 
is expected to be much more poorly defined than that of galaxy clustering, both because the cluster distribution is 
dominated by small number statistics (which one ameliorates in the topology analysis by smoothing) and because the 
formation of structures of a well-defined dimensionality is not expected on the very large scales (which are evolving 
in a quasi-linear fashion) probed by clusters. One might expect the performance of any shape statistic therefore to 
be rather poorer than a topological descriptor. This will, in fact, be what we find. 

6.1 Theory 

Full details of our method are described in Pearson et al. (1995); we simply define our notation here. The MST (Ore
1962; Gower & Ross 1969; Zahn 1971) is derived from graph theory and is a construct that (uniquely) connects a set
of $N$ points ('nodes') with $N - 1$ straight lines ('edges') in such a way that the sum of the edge lengths is a minimum
and there are no closed circuits in the graph thus formed. For applications of this construction to galaxy clustering
problems, see Barrow, Bhavsar & Sonoda (1984, 1985), Bhavsar & Ling (1988a, b), Plionis, Valdarnini & Jing (1992) and
Krzewina & Saslaw (1996). For an interesting discussion of the relationship between the MST and the percolation
approach used in the previous section, see Bhavsar & Splinter (1996).
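For concreteness, the construction just defined can be sketched with Prim's algorithm, one standard way of building a minimal spanning tree. This is an illustrative implementation for small point sets and makes no claim to be the code used for the results in this paper; the function name and the edge representation are hypothetical.

```python
import numpy as np

def minimal_spanning_tree(points):
    """Return the N - 1 MST edges as (node_i, node_j, length) using Prim's algorithm."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True                                   # grow the tree from node 0
    best_dist = np.linalg.norm(pts - pts[0], axis=1)    # shortest link to the tree so far
    best_from = np.zeros(n, dtype=int)                  # tree node providing that link
    edges = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))                  # closest node not yet in the tree
        edges.append((int(best_from[j]), j, float(best_dist[j])))
        in_tree[j] = True
        d = np.linalg.norm(pts - pts[j], axis=1)
        closer = (~in_tree) & (d < best_dist)
        best_dist[closer] = d[closer]
        best_from[closer] = j
    return edges
```

The cost of this version scales as the square of the number of points, which is adequate for cluster samples of a few hundred to a few thousand objects.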

Once the tree is constructed it can be separated by removing all the edges that exceed a specified cutoff length. 
As is usual, we define the cutoff length as a multiple of the mean edge length of the MST so as to delete chance 
linkages. After separation, the MST will fall into a number of disjoint trees whose properties can be further explored 
individually. Pearson & Coles (1995) showed how separation can be used to enhance structures relative to surrounding 
noise. 
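The separation step can be sketched in the same spirit. The example below assumes that the tree is supplied as a list of (node, node, length) edges and that the cutoff is a multiple `factor` of the mean edge length; both the function name and this input format are hypothetical.

```python
def separate_tree(n_nodes, edges, factor):
    """Remove edges longer than factor * (mean edge length); return the disjoint pieces."""
    lengths = [w for _, _, w in edges]
    cutoff = factor * sum(lengths) / len(lengths)
    parent = list(range(n_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path compression
            i = parent[i]
        return i

    for i, j, w in edges:
        if w <= cutoff:                      # keep only edges that do not exceed the cutoff
            parent[find(i)] = find(j)

    pieces = {}
    for node in range(n_nodes):
        pieces.setdefault(find(node), []).append(node)
    return sorted(pieces.values(), key=len, reverse=True)

# A four-node chain with edge lengths 1, 2 and 4 (mean 7/3):
chain = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 4.0)]
pieces = separate_tree(4, chain, 1.0)        # only the longest edge is removed
```

Pieces containing a single node correspond to the isolated points that dominate very sparse samples.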

The MST is a construction rather than a statistic, so in order to use it to describe galaxy clustering we have
to quantify the shape of the tree(s). We have chosen to use the structure functions $S_1$, $S_2$ and $S_3$ defined originally
by Babul & Starkman (1992). These are calculated by first defining the moment of inertia tensor around the centre
of mass of each piece of the separated tree. The eigenvalues of this tensor are used to define quantities $S_i$ such
that $0 \le S_i \le 1$ and $(S_1, S_2, S_3) = (0, 0, 1)$ for a spherical distribution, $(S_1, S_2, S_3) = (0, 1, 0)$ for a flat sheet, and
$(S_1, S_2, S_3) = (1, 0, 0)$ for a straight filament. The functions are further designed to fall away rapidly from unity as
a structure deviates from the shape specified by a particular value of $S_i$. One can then look, for example, at the
distribution of $S_i$ values over the pieces of the tree; see Pearson & Coles (1995) for further details.
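The first stage of this calculation can be illustrated as follows. The sketch computes only the eigenvalues of the second-moment tensor of a piece of the tree about its centre of mass, and the two axis ratios derived from them; it does not implement the structure functions themselves, whose functional forms are given by Babul & Starkman (1992). Equal weights for all nodes, the normalisation of the tensor and the function name are assumptions made for the example.

```python
import numpy as np

def shape_axis_ratios(points):
    """Axis ratios (intermediate/major, minor/major) from the second-moment tensor."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 2:
        raise ValueError("at least two nodes are needed to define a shape")
    centred = pts - pts.mean(axis=0)              # coordinates about the centre of mass
    tensor = centred.T @ centred / len(pts)
    eig = np.sort(np.linalg.eigvalsh(tensor))[::-1]
    eig = np.clip(eig, 0.0, None)                 # guard against tiny negative round-off
    if eig[0] == 0.0:
        raise ValueError("all nodes coincide")
    return float(np.sqrt(eig[1] / eig[0])), float(np.sqrt(eig[2] / eig[0]))
```

In this description a straight filament has both ratios near zero, a flat sheet has ratios near (1, 0), and a roughly spherical distribution has both ratios near unity, which correspond to the three limiting cases of $(S_1, S_2, S_3)$ quoted in the text.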

6.2 Analysis 

The analysis we carry out here follows that in Pearson & Coles (1995), which was applied to the simulations of Borgani
et al. (1994) and which showed that this method could discern differences between the models presented in that paper.
For each data set we first construct the MST, and then separate the tree as outlined above. The values of $S_i$ are then
calculated for each separate piece of the tree. In Pearson & Coles (1995), it was found that the structure functions
$S_1$ and $S_2$ were close to zero for the Borgani et al. (1994) simulations, indicating the lack of any obvious filamentary
or sheet-like pattern. We again found this for the newer simulations. The $S_3$ statistic, however, was shown to have
an interesting behaviour, and we therefore look at it further here.

Figure 6 

Figure 6 shows the distribution of $S_3$ values for all the simulated clusters and for a separation length equal to $Fx$,
where $x$ is the mean edge length and the results are integrated over all values of $F$ for simplicity of presentation;
the trends with $F$ are quite consistent in the different models. The TCDM and LOWH models have a more peaked
distribution of $S_3$ values than the other models, while that of the ΛCDM model is much broader and has a lower average.
Roughly speaking, this means that the distribution in the ΛCDM case is less spherical than in the others, which is
consistent with the greater amount of dynamical evolution on large scales in this model than in the other cases, and
which is also evident in the analysis we performed in Section 4.

We have also looked at the differences in the number of trees formed in each model at different edge cutoff lengths
(not shown). Again, the ΛCDM model stands out because it has the greatest range of values over which trees
form: all curves peak at around $F = 1.25$ and fall off rapidly as $F$ is increased.

Our most disappointing result, however, is that when we select clusters with a mean spacing of $40\,h^{-1}$ Mpc, we
find that the MST produces insufficient trees to proceed with the analysis on simulation volumes of this size. When
the tree is constructed and separation performed with any reasonable value of $F$, one simply gets trees containing
only one node. We are therefore unable to use this descriptor for more detailed tests of selected clusters. As we feared,
small number effects prevent us using this method for samples as sparsely sampled and within such a small volume
as our simulations.

The conclusion of this section is, therefore, that while the MST method proposed by Pearson & Coles (1995) can
indeed discriminate between different underlying distributions, the shot-noise associated with clusters, and the lack
of a pattern with a specific dimensionality, seem to pose insuperable problems for testing the cluster distribution
using data sets of size comparable to those we have used in this study.



7 STATISTICAL TESTS 

The results we have discussed so far have been displayed for visual interpretation only, and without a detailed study 
of the errors and resulting confidence limits. We have shown results only for the theoretical simulations, and not for 
the real cluster catalogue. Now there are two main tasks one might be interested in setting for clustering descriptors 
of the kind we have discussed so far in this paper. One is to indicate differences between the pattern displayed by the 
different models so one can understand the impact of initial conditions and evolution on the clustering pattern. The 
second task is to test specific models against real data to find out whether they are compatible. 

What we have done, therefore, is to recalculate the EPC and percolation statistics for the Abell/ACO sample
described in Section 3 (since the MST method performs so poorly for the idealised samples, we do not discuss it
further in this section: it is even worse when applied to the real catalogue, which is smaller). To obtain maximum
discriminatory power we do not restrict ourselves to one set of gridding or smoothing parameters: we use a bank
of results for $16^3$, $32^3$ and $64^3$ grids and for four different choices of smoothing length ($10\,h^{-1}$, $20\,h^{-1}$, $30\,h^{-1}$ and
$40\,h^{-1}$ Mpc), and for an ensemble of 5 simulations, to calculate significance levels for differences in behaviour of these
descriptors (i) between the models and (ii) between the models and the data. We constructed distributions for each of
the quantities involved in the analyses we have described, and then used a Monte-Carlo test using the Kolmogorov-
Smirnov statistic to find the fraction of times that the distributions were found to be different. This then yields a
robust estimate of the statistical significance of differences between the models under each of these descriptors. We
looked at this question for each descriptor separately and then, at the end, when all descriptors are used in concert.

Table 2. Power of EPC as a discriminator. The diagonal shows the comparison with the real Abell/ACO data,
above right of the diagonal shows discrimination between models using the distribution of all the density peaks, below left shows
discrimination between models using selected clusters only.

        CHDM   OCDM   SCDM   ΛCDM   LOWH   TCDM
CHDM    0.92   0.58   0.17   1.00   0.58   0.75
OCDM    0.33   0.92   0.25   0.91   0.75   0.91
SCDM    0.50   0.83   0.92   1.00   0.83   0.91
ΛCDM    0.66   0.58   0.91   0.42   1.00   1.00
LOWH    0.50   0.91   0.91   0.91   1.00   0.23
TCDM    0.50   0.66   0.09   0.75   0.00   1.00

Table 3. Power of $\mu_\infty$ as a discriminator. The diagonal shows the comparison with the real Abell/ACO data,
above right of the diagonal shows discrimination between models using the distribution of all the density peaks, below left shows
discrimination between models using selected clusters only.

        CHDM   OCDM   SCDM   ΛCDM   LOWH   TCDM
CHDM    0.92   0.33   0.25   0.66   0.33   0.91
OCDM    0.33   0.92   0.91   0.75   0.66   0.83
SCDM    0.09   0.50   1.00   0.66   0.50   0.83
ΛCDM    0.66   0.25   0.75   0.50   1.00   1.00
LOWH    0.17   0.50   0.17   0.83   1.00   0.33
TCDM    0.25   0.41   0.17   1.00   0.00   0.92
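The Monte-Carlo comparison based on the Kolmogorov-Smirnov statistic described in this section can be sketched as follows. The details are assumptions made for illustration and are not taken from the analysis itself: a two-sample Kolmogorov-Smirnov statistic, a permutation estimate of its significance, a nominal threshold `alpha`, and a final fraction taken over a hypothetical list of sample pairs (for instance, one pair per choice of grid and smoothing length). All function names are hypothetical.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def permutation_p_value(a, b, n_shuffles=999, seed=0):
    """Fraction of random relabellings with a KS statistic at least as large as observed."""
    rng = np.random.default_rng(seed)
    observed = ks_statistic(a, b)
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            count += 1
    return (count + 1) / (n_shuffles + 1)

def fraction_different(pairs, alpha=0.05, **kwargs):
    """Fraction of sample pairs whose distributions differ at significance level alpha."""
    flags = [permutation_p_value(a, b, **kwargs) < alpha for a, b in pairs]
    return sum(flags) / len(flags)
```

A value of `fraction_different` close to unity indicates that the two sets of samples are distinguished in nearly every comparison, and a value near zero that they are statistically indistinguishable.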

Figure 7 

The need for such a detailed statistical study is demonstrated by the form of Figure 7 which shows the results obtained 
for the EPC smoothed and gridded as in Section 4 for the Abell/ACO sample and for samples extracted from the 
simulations according to the same selection criteria (i.e. radial distribution and sky coverage). It is by no means 
obvious whether there are any systematic departures, although some of the model curves appear to be discrepant. 
Merely plotting error bars on this curve would not help much as differences in the shape are more important than 
differences in the amplitude. 

We now display the results of this procedure in a series of tables which all have the same format: along the
downward diagonal we see the significance level of departures of the simulated Abell/ACO clusters (i.e. clusters
selected according to the same criteria as the Abell/ACO sample) against the real data; above the diagonal shows
the significance level of departures of models from each other based on the properties of all clusters in the simulation;
below the diagonal we have discrimination between models based on the properties of clusters selected with a mean
spacing of $40\,h^{-1}$ Mpc (but still in a cubic volume). To give an example, in Table 2, we see that the ΛCDM model
disagrees with the Abell/ACO data at a 42% confidence level, while it is different from OCDM at the 91% level if all
data are included and different from CHDM at the 66% level if only selected clusters are used.

Table 2 shows the results for the EPC only. Looking first at the diagonal reveals that all but the LOWH and
TCDM models are consistent with the Abell/ACO data at the 95% confidence level. The best fit is ΛCDM. The ability of
the method to discriminate between models is variable and can be very poor if only the selected clusters are used: for
example, LOWH and TCDM appear identical in this case. Table 3 shows the results for $\mu_\infty$ only. This appears to rule
out both SCDM and LOWH at the 95% level, while ΛCDM is again the best fit. Although the discrimination is again
variable, and this statistic does not simply track the behaviour of the EPC test, the average power of discrimination
is somewhat lower for this statistic than for the EPC. Table 4 shows the behaviour of $\mu^2$. In terms of this statistic, all
models appear to be discrepant with the data. This may however be due to the fact that for these small samples only
a small number of 'clusters' are involved in the calculation of equation (15) and the results are therefore oversensitive
to fluctuations from sample to sample. Our power tests are an attempt to quantify the robustness of the statistic
to fluctuations of this type, but if the variation over the ensemble is too large they will not be reliable. It is also
possible that boundary effects dominate the behaviour of this quantity for the real data because these interact in a
different way in the percolation analysis than in the EPC analysis. In any case, there certainly seems to be stronger
discrimination between models with $\mu^2$ than with $\mu_\infty$ alone.

Table 4. Power of $\mu^2$ as a discriminator. The diagonal shows the comparison with the real Abell/ACO data, above
right of the diagonal shows discrimination between models using the distribution of all the peaks, below left shows discrimination
between models using selected clusters only.

        CHDM   OCDM   SCDM   ΛCDM   LOWH   TCDM
CHDM    1.00   0.66   0.66   0.91   0.91   1.00
OCDM    0.58   1.00   0.50   1.00   1.00   1.00
SCDM    0.58   0.75   1.00   0.75   0.91   0.91
ΛCDM    0.66   0.83   0.50   1.00   1.00   1.00
LOWH    0.73   0.83   0.83   0.66   1.00   0.91
TCDM    0.83   0.66   0.75   0.66   0.50   1.00

Table 5. Power of all tests combined into a single discriminator. The diagonal shows the comparison with the
real Abell/ACO data, above right of the diagonal shows discrimination between models using the distribution of all the density
peaks, below left shows discrimination between models using selected clusters only.

        CHDM   OCDM   SCDM   ΛCDM   LOWH   TCDM
CHDM    0.94   0.53   0.36   0.86   0.61   0.89
OCDM    0.39   0.94   0.28   0.88   0.80   0.91
SCDM    0.39   0.69   0.97   0.81   0.75   0.89
ΛCDM    0.67   0.56   0.72   0.64   1.00   1.00
LOWH    0.47   0.75   0.36   0.81   1.00   0.50
TCDM    0.53   0.58   0.33   0.83   0.17   1.00

Since the tests are not overwhelmingly powerful on an individual basis, we look at the results of combining
a battery of these three into one 'supertest' which would make use of any complementarity that exists in these
descriptors. Table 5 shows the effectiveness of combining all the tests into one. Notice that LOWH, SCDM and
TCDM are all excluded with at least 95% confidence, but that the best discrimination that can be achieved between
the models, though better than in the previous tables, is generally less than 95%.

8 DISCUSSION AND CONCLUSIONS 

In the Introduction to this paper, we stressed that this analysis was to be treated as exploratory because there were
reasonable grounds to doubt the quality of present clustering data and that looking for geometrical signatures of the
pattern of superclustering was in any case difficult because of the extreme rareness of rich clusters and the consequent
sparse sampling and shot-noise this implies.

Nevertheless, as a guide to the results one might expect from larger and better controlled cluster samples, the
results we have obtained are extremely encouraging, at least for some of the tests we have used. Although this
optimism is largely based on results from simulations which may be reasonably argued to be much 'cleaner' than
real data are likely to be, our results show at least that there are perceptible differences between these models on
large scales and that these do in principle allow one to discriminate between them using shape- and topology-based
descriptors.

For our topological analysis, based on the EPC, clear differences emerge between the models. One has to be a
little careful here, however, because the form of the statistic we use actually contains information about the one-point
distribution function of the objects, because of the choice of threshold parameter $\nu$. Remember also that the
amplitude of the EPC curve is related to the coherence length of the density field and that this is simply derived from
the power spectrum. Comparing the trends we see in the EPC analysis with the trends of the one-point distribution
found in an analysis of the same models by Borgani et al. (1995), together with the coherence lengths of the initial
power spectra, shows that the behaviour of the EPC for different simulations can, roughly speaking, be 'explained'
in terms of these other descriptions. Although differences therefore show up between the models, they are largely
the same as the differences one finds in non-topological descriptors. One would be justified therefore in saying that
this descriptor does not add very much: it just provides a different way of seeing differences in one- and two-point
information. Nevertheless, folding such information in with the topology (which is in any case very easy to measure)
does seem to provide a simple methodology for discriminating between models which does not require the computation
of power-spectra and distribution functions and may in any case incorporate at least some information beyond that
contained in these quantities.

On the other hand, the topology of the Abell/ACO data does not display the same kind of EPC graph that one
would expect by looking at the results of Plionis et al. (1995) and Borgani et al. (1995) and assuming it follows the
same trends as our models. This may be telling us that the Abell/ACO sample is essentially different from all of the models
we have looked at in this paper, which in turn may mean either that all the models are incorrect or that there is
something suspicious about the catalogues or the way we have interpreted them. In particular, the effects of redshift
selection, galactic extinction and the differences in number density between the Abell and ACO catalogues introduce
some uncertainty into our conclusions.

The one model that does have a topological description in reasonable accord with the Abell/ACO data is the
ΛCDM model, a result which agrees with the results of Kerscher et al. (1997) (although the model they used had a
rather smaller value of $\Omega_\Lambda = 0.65$ than the model we have used here). This model also survives the tests described
in Borgani et al. (1995), but there was uncertainty attached to that analysis because of the possibility of that model
being too strongly clustered to be adequately described by the Zel'dovich approximation. We have shown that this
extra evolution does not influence the behaviour of the EPC to any significant extent and the claim that this model
can reproduce the behaviour of Abell/ACO in terms of topology and low-order moments therefore stands up to
scrutiny. This, of course, still admits the possibility that this is telling us more about problems with the catalogue
than about the real distribution of overdensities.

The performance of our percolation test depends strongly on the kind of statistic one extracts from the percolated
set. If one looks only at the statistic $\mu_\infty$ then the power of discrimination is mediocre, but this rises strongly if one
uses $\mu^2$ instead of or together with $\mu_\infty$.

The one disappointment of this analysis is the performance of the MST/shape functions we introduced in Pearson
& Coles (1995). Although they do perform well for relatively well-sampled distributions, we were unable to get useful
results for any of the simulated samples of clusters. The application of this statistic, at least in the form we have used
it here, is not recommended for extremely sparsely-sampled distributions like those of Abell clusters.

Our final conclusion, however, is that topological and geometrical descriptors (of which we have studied only
three) are at least in principle capable of diagnosing differences between very sparsely-sampled distributions in a
fashion which is quite independent of the one- and two-point statistics which are more familiar in the cosmological
community. With the arrival of larger and better controlled samples of galaxy redshifts and the cluster catalogues
which will accompany them, clustering data will not only be more amenable to this type of analysis, they will also
require such an approach if one is to extract as much information as possible.

FIGURE CAPTIONS

Figure 1. Results for the EPC ($\chi$) as a function of density threshold, $\nu$, expressed in standard deviations from the
mean, for all models described in the text using all peaks found in the simulations to define the structure. Panel (a)
shows CHDM (solid lines), ΛCDM (dotted) and LOWH (dashed); Panel (b) shows OCDM (solid), SCDM (dotted)
and TCDM (dashed). Error bars are shown in (a) for CHDM and in (b) for OCDM only to avoid crowding the plot.
The vertical scaling is arbitrary, but is identical for all the models.

Figure 2. Results for the EPC ($\chi$) as a function of density threshold, $\nu$, expressed in standard deviations from the
mean, for all models described in the text using clusters selected in the simulations so as to have a fiducial mean
spacing of $40\,h^{-1}$ Mpc. Panel (a) shows CHDM (solid lines), ΛCDM (dotted) and LOWH (dashed); Panel (b) shows
OCDM (solid), SCDM (dotted) and TCDM (dashed). Error bars are shown in (a) for CHDM and in (b) for OCDM
only to avoid crowding the plot. The vertical scaling is arbitrary, but is identical for all the models.

Figure 3. Results for the percolation statistic $\mu_\infty$ for all models described in the text using all the peaks found in
the simulation to define the structure. Panel (a) shows CHDM (solid lines), ΛCDM (dotted) and LOWH (dashed);
Panel (b) shows OCDM (solid), SCDM (dotted) and TCDM (dashed). Error bars are shown in (a) for CHDM and
in (b) for OCDM only to avoid crowding the plot.

Figure 4. Results for the percolation statistic $\mu_\infty$ for all models described in the text using selected clusters only,
as in Figure 2. Panel (a) shows CHDM (solid lines), ΛCDM (dotted) and LOWH (dashed); Panel (b) shows OCDM
(solid), SCDM (dotted) and TCDM (dashed). Error bars are shown in (a) for CHDM and in (b) for OCDM only to
avoid crowding the plot.

Figure 5. Results for the percolation statistic $\mu^2$ for all models described in the text using selected clusters only,
as in Figure 2. Panel (a) shows CHDM (solid lines), ΛCDM (dotted) and LOWH (dashed); Panel (b) shows OCDM
(solid), SCDM (dotted) and TCDM (dashed). Error bars are shown in (a) for CHDM and in (b) for OCDM only to
avoid crowding the plot.

Figure 6. Integrated distribution of the shape-space statistic $S_3$ for all clusters selected and for all models. The
distributions are obtained by co-adding distributions for various values of $F$, as described in the text. Panel (a) shows
CHDM (solid lines), ΛCDM (dotted) and LOWH (dashed); Panel (b) shows OCDM (solid), SCDM (dotted) and
TCDM (dashed). Error bars are not shown, as the results come from co-adding all the simulation results.

Figure 7. Results for the EPC as a function of density threshold, expressed in standard deviations from the mean,
for all models described in the text. Samples were extracted according to the same selection criteria as the Abell/ACO
sample, which is also shown for comparison. The figures show: (a) CHDM; (b) ΛCDM; (c) TCDM; (d) LOWH; (e)
OCDM; (f) SCDM; appropriate error bars are drawn on these curves. The heavy solid line in each plot shows the
corresponding results for the Abell/ACO catalogue. The noisiness of these curves demonstrates the need for careful
statistical assessment of the discriminatory power.